A Self-Adaptive Reinforcement-Exploration Q-Learning Algorithm

Authors

Abstract

To address several problems of the traditional Q-Learning algorithm, such as heavy repetition and unbalanced exploration, a reinforcement-exploration strategy is used to replace the decayed ε-greedy strategy, and a novel self-adaptive reinforcement-exploration Q-Learning (SARE-Q) algorithm is thus proposed. First, the concept of a behavior utility trace is introduced, and the probability of each action being chosen is adjusted according to this trace so as to improve the efficiency of exploration. Second, the attenuation process of the exploration factor ε is designed in two phases: the first phase centers on exploration, and the second shifts the focus from exploration to utilization, with the exploration rate dynamically adjusted according to the success rate. Finally, by maintaining a list of state access times, the exploration factor of the current state is adaptively adjusted according to the number of times that state has been accessed. A symmetric grid map environment was established on the OpenAI Gym platform to carry out symmetrical simulation experiments on the Q-Learning algorithm, the self-adaptive Q-Learning (SA-Q) algorithm, and the SARE-Q algorithm. The experimental results show that SARE-Q has obvious advantages over the other two algorithms in the average number of turning times, the average inside success rate, and the number of times with the shortest planned route.
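The three mechanisms named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the `switch_episode` threshold, the `1 - success_rate` schedule, the `1 / (1 + visits)` scaling, and the choice to make recently used actions less likely during exploration are all hypothetical and are not taken from the paper.

```python
import random


def two_phase_epsilon(episode, success_rate, switch_episode=200,
                      eps_max=1.0, eps_min=0.05):
    """Assumed two-phase schedule for the exploration factor.

    Phase 1 (before switch_episode): keep exploration at eps_max.
    Phase 2: shrink epsilon as the observed success rate rises.
    """
    if episode < switch_episode:
        return eps_max
    return max(eps_min, eps_max * (1.0 - success_rate))


def state_epsilon(base_eps, visits, eps_min=0.05):
    """Assumed per-state adjustment: explore less in often-visited states."""
    return max(eps_min, base_eps / (1.0 + visits))


def exploration_probs(traces):
    """Assumed use of a behavior utility trace: actions with a larger
    trace get a smaller probability when exploring, to limit repetition."""
    weights = [1.0 / (1.0 + t) for t in traces]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_row, trace_row, eps, rng):
    """Explore with trace-weighted probabilities, otherwise act greedily."""
    actions = range(len(q_row))
    if rng.random() < eps:
        return rng.choices(actions, weights=exploration_probs(trace_row))[0]
    return max(actions, key=q_row.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    q_row = [0.1, 0.7, 0.2]      # Q-values of one state (hypothetical)
    trace_row = [0.0, 1.0, 3.0]  # behavior utility traces (hypothetical)
    eps = state_epsilon(two_phase_epsilon(300, 0.5), visits=4)
    print("epsilon:", eps)
    print("action:", choose_action(q_row, trace_row, eps, rng))
```

In this sketch the global schedule and the per-state visit count are composed by passing the output of `two_phase_epsilon` into `state_epsilon`; the paper may combine them differently.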


Similar resources

Self-Regulating Action Exploration in Reinforcement Learning

The basic tenet of a learning process is for an agent to learn for only as much and as long as it is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using a...


Reinforcement Learning with Exploration


Adaptive Aggregation for Reinforcement Learning with Efficient Exploration: Deterministic Domains

We propose a model-based learning algorithm, the Adaptive Aggregation Algorithm (AAA), that aims to solve the online, continuous state space reinforcement learning problem in a deterministic domain. The proposed algorithm uses an adaptive state aggregation approach, going from coarse to fine grids over the state space, which enables the use of finer resolution in the “important” areas of the state ...


Adaptive-Resolution Reinforcement Learning with Efficient Exploration in Deterministic Domains

We propose a model-based learning algorithm, the Adaptive-resolution Reinforcement Learning (ARL) algorithm, that aims to solve the online, continuous state space reinforcement learning problem in a deterministic domain. Our goal is to combine an adaptive-resolution approximation scheme with efficient exploration in order to obtain fast (polynomial) learning rates. The proposed algorithm uses an a...


RTP-Q: A Reinforcement Learning System with Time Constraints Exploration Planning for Accelerating the Learning Rate

Reinforcement learning is an efficient method for solving Markov Decision Processes, in which an agent improves its performance using scalar reward values and shows a high capability for reactive and adaptive behavior. Q-learning is a representative reinforcement learning method that is guaranteed to obtain an optimal policy but needs numerous trials to achieve it. k-Certainty Exploration Learning Sy...



Journal

Journal title: Symmetry

Year: 2021

ISSN: 0865-4824, 2226-1877

DOI: https://doi.org/10.3390/sym13061057